41 research outputs found

    Grammar Sharing Techniques for Rule-Based Multilingual NLP Systems

    Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit. University of Tartu, Tartu, 2007. ISBN 978-9985-4-0513-0 (online) ISBN 978-9985-4-0514-7 (CD-ROM) pp. 253-260

    Linguistic representation of Finnish in the medical domain spoken language translation system

    This paper describes the development of Finnish linguistic resources for use in MedSLT, an Open Source medical domain speech-to-speech translation system. The paper describes the collection of medical Finnish corpora, the creation of a Finnish grammar by adapting the original English grammar, the composition of a domain-specific Finnish lexicon and the definition of interlingua-to-Finnish mapping rules for multilingual translation. It is shown that Finnish can be effectively introduced into the existing MedSLT framework and that, despite the differences between English and Finnish, the Finnish grammar can be created by manual adaptation from the original English grammar. Regarding further development, the initial evaluation results of English-Finnish speech-to-speech translation are encouraging

    Efficient development of grammars for multilingual rule-based applications

    Rule-based natural language processing (NLP) applications, such as machine translation systems, use grammars that must be developed for many languages. Creating multilingual grammars is, however, a long and laborious process that requires a great deal of knowledge. In this thesis, we examine the possibility of facilitating and accelerating the development of grammars used in multilingual NLP systems. To this end, we propose a parameterizable grammar that can be shared by typologically different languages: English, Finnish and Japanese. The grammar is developed with the Regulus platform (Rayner et al., 2006). Regulus grammars are unification grammars that can be used in different NLP tasks, such as speech recognition, parsing and generation. We have integrated the external ontology development module Protégé (Protege, 2009) into this platform

    Linguistic representation of Finnish in a limited domain speech-to-speech translation system

    This paper describes the development of Finnish linguistic resources for use in MedSLT, an Open Source medical domain speech-to-speech translation system. The paper describes the collection of the medical sub-domain corpora for Finnish, the creation of the Finnish generation grammar by adapting the original English grammar, the composition of the domain-specific Finnish lexicon and the definition of interlingua-to-Finnish mapping rules for multilingual translation. It is shown that Finnish can be effectively introduced into the existing MedSLT framework and that, despite the differences between English and Finnish, the Finnish grammar can be created by manual adaptation from the original English grammar. An initial evaluation of English-to-Finnish speech-to-speech translation is also presented

    Comparing Speech Recognizers Derived from Mono- and Multilingual Grammars

    This paper examines how multilingual parameterized grammar rules perform in speech recognition. We compare two types of Japanese and English grammar-based speech recognizers: one derived from monolingual grammar rules and the other from multilingual parameterized grammar rules. The latter therefore uses the same grammar rules to create the language models for both languages. We carried out speech recognition experiments on a limited-domain dialog application. These experiments show that the language models derived from multilingual parameterized grammar rules (1) perform equally well on both tested languages, English and Japanese, and (2) perform comparably to recognizers derived from monolingual grammars that were explicitly developed for these languages. This suggests that sharing grammar resources between languages could be one way to develop rule-based speech recognizers more efficiently

    Multilingual Grammar Resources in Multilingual Application Development

    Grammar development makes up a large part of the multilingual rule-based application development cycle. One way to decrease the required grammar development effort is to base the systems on multilingual grammar resources. This paper presents a detailed description of a parametrization mechanism used for building multilingual grammar rules. We show how these rules, which had originally been designed and developed for typologically different languages (English, Japanese and Finnish), are applied to a new language (Greek). The shared grammar system has been implemented for a domain-specific speech-to-speech translation application. A majority of these rules (54%) are shared amongst the four languages, and 75% of the rules are shared by at least two languages. The main benefit of the described approach is shorter development cycles for new system languages